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ABSTRACT 

The observations in many applications consist of counts of dis- 
crete events, such as photons hitting a dector, which cannot be effec- 
tively modeled using an additive bounded or Gaussian noise model, 
and instead require a Poisson noise model. As a result, accurate 
reconstruction of a spatially or temporally distributed phenomenon 
(/) from Poisson data (y) cannot be accomplished by minimizing a 
conventional £2 — ii objective function. The problem addressed in 
this paper is the estimation of / from y in an inverse problem set- 
ting, where (a) the number of unknowns may potentially be larger 
than the number of observations and (b) / admits a sparse approx- 
imation in some basis. The optimization formulation considered in 
this paper uses a negative Poisson log-likelihood objective function 
with nonnegativity constraints (since Poisson intensities are natu- 
rally nonnegative). This paper describes computational methods for 
solving the constrained sparse Poisson inverse problem. In particu- 
lar, the proposed approach incorporates key ideas of using quadratic 
separable approximations to the objective function at each iteration 
and computationally efficient partition-based multiscale estimation 
methods. 

Index Terms — Photon-limited imaging, Poisson noise, wavelets, 
convex optimization, sparse approximation, compressed sensing 

1. INTRODUCTION 

In a variety of applications, ranging from nuclear medicine to night 
vision and from astronomy to traffic analysis, data are collected by 
counting a series of discrete events, such as photons hitting a detector 
or vehicles passing a sensor The measurements are often inherently 
noisy due to low count levels, and we wish to reconstruct salient 
features of the underlying phenomenon from these noisy measure- 
ments as accurately as possible. The inhomogeneous Poisson pro- 
cess model [11 has been used effectively in many such contexts. Un- 
der the Poisson assumption, we can write our observation model as 

y ~ Poisson(A/), (1) 

where / G K+ is the signal or image of interest, A e K^|'^™ lin- 
early projects the scene onto a TV-dimensional set of observations, 
and y G Z+ is a length- A'^ vector of observed Poisson counts. 

The problem addressed in this paper is the estimation of / from 
3/ in a compressed sensing context, when (a) the number of un- 
knowns, m, may be larger than the number of observations, A'^, and 
(b) / is sparse or compressible in some basis W (i.e. / = W6 
and 6 is sparse). This challenging problem has clear connections 
to "compressed sensing" (CS) (e.g., |2|), but arises in a number of 
other settings as well, such as tomographic reconstruction in nuclear 
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medicine, superresolution image reconstruction in astronomy, and 
deblurring in confocal microscopy. 

In recent work |3|, we explored some of the theoretical chal- 
lenges associated with CS in a Poisson noise setting, and in partic- 
ular highlighted two key differences between the conventional CS 
problem and the Poisson CS problem: 

1 . unlike many sensing matrices in the CS literature, the matrix 
A must contain all nonnegative elements, and 

2. the intensity /, and any estimate of /, must be nonnegative. 

Not only are these differences significant from a theoretical stand- 
point, but they pose computational challenges as well. In this paper, 
we consider algorithms for computing as estimate / of / from ob- 
servations y according to 

/ = argmin-log p(i/ U/ j +pen(/) 
/er VI/ 

JV 

= argmin^ | (^Af^ - y, log [(^A/) ] j + pen(7), 
/er ' ' 

where 

• p(^y 1^/) is the Poisson likelihood of y given intensity Af, 

• (^Af ^ is the i"^ element of the vector Af, 

• r is a collection of nonnegative estimators such that / > for 
all/er, and 

• pen(/ ) is a penalty term which measures the sparsity of / in 
some basis. 

This paper explores computational methods for solving |2](. The 
nonnegativity of / and A results in challenging optimization prob- 
lems. In particular, the restriction that / is nonnegative introduces 
a set of inequality constraints into the minimization setup; as shown 
in |4|, these constraints are simple to satisfy when / is sparse in 
the canonical basis, but they introduce significant challenges when 
enforcing sparsity in an arbitrary basis. We will consider several 
variants of the penaly term, including ||/||i, || for some arbi- 

trary orthonormal basis W, and a complexity regularization penalty 
based upon recursive dyadic partitions. We refer to our approach as 
SPIRAL (Sparse Poisson Intensity Reconstruction ALgorithm). 

The paper is organized as follows. We first consider related op- 
timization methods for solving the compressed sensing problem in 
Sec.|2] Next, in Sec. [3] we consider three methods for solving the 
constrained penalized Poisson likelihood estimation problem with 
different penalty terms. The first two approaches use an £1 penalty 
while the third uses partition-based penalty functions. Experimen- 
tal results comparing the proposed approaches with expectation- 
maximization (EM) algorithms are presented in Sec.|4] 



2. SPARSITY-REGULARIZED OPTIMIZATION 



WO > 0), the minimization problem 



Without any nonnegativity constraints on /, the Poisson minimiza- 
tion problem (|2| can be solved using the SpaRSA algorithm of 
Figueireido et al. |5 1, which solves a sequence of minimization prob- 
lems using quadratic approximations to the log penalty function: 



/ 



fc + l A_ 



argmin ^ ||/ - s'^Ha + — pen(/), 

/6E"» ^ Oik 



where 



(2) 



(3) 



ak is a positive scalar chosen using Barzilai-Borwein (spectral) 
methods (see [5J for details), and 

F{f)^-\ogpiy\Af) 

is the negative log Poisson likelihood in l[2]l. If the penalty term 
in |2| is separable, i.e., it is the sum of functions with each func- 
tion depending only on one component of /, then (|2| can be solved 
component-wise, making the problem (j2]l relatively easy to solve. 

Much of existing CS optimization literature (e.g., (6l|7][8l) fo- 
cuses on penalty functions where pen(/) oc where W is 
some orthonormal basis, such as a wavelet basis, and 9 = W^f are 
the expansion coefficients in that basis. Also, nonnegativity in the 
signal / = W9 > is not necessarily enforced. In this paper, we 
develop three approaches to solving the optimization problem in (|2j. 
In all three approaches, we require / to be nonnegative. The first ap- 
proach assumes that the image is sparse in the canonical basis, i.e., 
W = I, while the second involves the more general orthogonal ma- 
trix W. The third uses partition-based denoising methods, described 
in detail in Sec. 13.31 



3. NONNEGATIVE REGULARIZED LEAST-SQUARES 
SUBPROBLEM 

3.1. Sparsity in the canonical basis 

Let pen(/) = tH/Hi, where r > is a regularization parame- 
ter. The minimization problem |2| with this penalty term and that is 
subject to nonnegativity constraints on / has the following analytic 
solution: 



ak 



where the operation [ • ]+ ~ max{ • , 0} is to be understood 
component-wise. Thus solving (j2} subject to nonnegativity con- 
straints with an li penalty function and with a solution that is sparse 
in the canonical basis is straightforward to solve. An alternative al- 
gorithm for solving this Poisson inverse problem with sparsity in the 
canonical basis was also explored in the recent literature |4J. 

3.2. Sparsity in an arbitrary orthonormal basis 

Now suppose that the signal of interest is sparse in some other basis. 
Then the £i penalty term is given by 

pen(/)^r||M/7||i=rje||i, 

where 9 = Wf for some orthonormal basis W, and r > is some 
scalar. When the reconstruction / — WO must be nonnegative (i.e., 
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ak 



subject to W9 > 



(4) 



no longer has an analytic solution necessarily. We can solve this 
minimization problem by solving its dual. First, we reformulate 
l|4| so that its objective function <}>^{9) is differentiable by defin- 
ing u,v > Q such that 9 = u — v. The minimization problem Q 
becomes 



fc + l fe + lN 
V 



■ arg min — w - 



S II2 H 1 {u + v) 

ak 



subject to u,v > 0, W{u — > 0, 



(5) 



which has twice as many parameters and has additional nonnegativ- 
ity constraints on the new parameters, but now has a differentiable 
objective function. The Lagrange dual problem associated with this 
problem is given by 



minimize /i(A,7)= lls*^ + 7 + M^^^-^lli 



1 II fell 2 

II2 



subject to A>0, - — 1<7<— 1 
ak ak 

and at the optimal values 7* and A*, the primal iterate 9 
u^^^ ~ is given by 



(6) 



fe+l A 



We note that the minimizers of the primal problem ^ and its dual 
l|6j satisfy (f>'{9^'^^) — — ^1(7*, A*) since ijs} satisfies (a weakened) 
Slater's condition |9|. In addition, the function — ft(7, A) is a lower 
bound on (p'' (9) at any dual feasible point. 

The objective function of |6]l can be minimized by altematingly 
solving for A and 7, which is accomplished by taking the partial 
derivatives of h{X, 7) and setting them to zero. Each component is 
then constrained to satisfy the bounds in At the j**^ iteration, the 
variables can, thus, be defined as follows: 



7«' = mid - — l,-s*=-Vy^A(^-i\— 1 
ak ak 



A*^' = [-l^(s'=+7(^))]^, 



where the operator mid(a, b, c) chooses the middle value of the three 
arguments component-wise. Note that at the end of each iteration 
j, the approximate solution g'-'' = + 7'^' + W^A*^' to is 
feasible with respect to the constraint W9 > 0: 



W9 



= Ws" + W-y'-" + A 



U) I \U) 



+ 1 



> 0. 



Unfortunately, there are several important disadvantages associ- 
ated with the above approach. First, the solution to the subproblem 
must itself be computed by an iterative algorithm - so while the prob- 
lem may be solvable, it must by solved at each iterate, and the result- 
ing overall algorithm may not be fast, particularly for large problems. 
Furthermore, since we rely on an iterative solver in this subproblem, 
our computed solution may not be the true optimal point, particu- 
larly if we use a gentle convergence requirement to limit computa- 
tion time. Further study is needed to understand the impact of com- 
puting inaccurate solutions to this subproblem on the performance 
of the overall algorithm. 



3.3. Partition-based penalties 

In the special case where the signal of interest is known to be smooth 
or piecewise smooth in the canonical basis (i.e. is compressible in 
a wavelet basis), we can formulate a penalty function which is a 
useful alternative to the l\ norm of the wavelet coefficient vector. 
In particular, we can build on the framework of recursive dyadic 
partitions (RDP), which we summarize here and are described in 
detail in 1 10 1 1 1. Let V be the class of all recursive dyadic partitions 
of [0, 1] where each interval in the partition has length at least 1/m, 
and let P G P be a candidate partition. The intensity on P, denoted 
f{P), is calculated using a nonnegative least-squares method to fit a 
model (such as a constant or polynomial) to s'' in ([s} on each interval 
in the RDP. Furthermore, a penalty can be assigned to the resulting 
estimator which is proportional to |P|, the number of intervals in P. 
Thus we set 

P = argmin^||/(P)-/-||^ + r|Pj 
f = fiP)- 

A search over V can be computed quickly using a dynamic pro- 
gram. When using constant partition interval models , the nonnega- 
tive least-squares fits can be computed non-iteratively in each inter- 
val by simply using the maximum of the average of s'' in that interval 
and zero. Because of this, enforcing the constraints is trivial and can 
be accomplished very quickly. 

It can be shown that partition-based denoising methods such as 
this are closely related to Haar wavelet denoising with an impor- 
tant hereditary constraint placed on the thresholded coefficients — if 
a parent coefficient is thresholded, then its children coefficients must 
also be thresholded 1,101 . This constraint is akin to wavelet-tree ideas 
which exploit persistence of significant wavelet coefficients across 
scales and have recently been shown highly useful in compressed 
sensing settings 1 12|. 

As we mention in Sec.|4]below, using constant model fits makes 
it easy to satisfy nonnegativity constraints and results in a very 
fast non-iterative algorithm, but has the disadvantage of yielding 
piecewise constant estimators. However, a cycle-spun translation- 
invariant version of this approach can be implemented with high 
computational efficiency |T0| and be used for solving this nonneg- 
ative regularized least-squares subproblem that results in a much 
smoother estimator. 

4. NUMERICAL SIMULATIONS 

We evaluate the effectiveness of the proposed approaches in re- 
constructing a piecewise smooth function from noisy compressive 
measurements. In our simulations, the true signal (the black line 
in Figs. [T|a-c)) is of length 1024. We take 512 noisy compres- 
sive measurements of the signal using a sensing matrix that con- 
tains 32 randomly distributed nonzero elements per row. This setup 
yields a mean detector photon count of 50, ranging from as few 
as 22 photons, to as high as 94 photons. We allowed each algo- 
rithm a fixed time budget of three seconds in which to run, which 
is sufficient to yield approximate convergence for all methods con- 
sidered. Each algorithm was initialized at the same starting point, 
which was generated using a single E-step of the EM algorithm. 
That is if z — A^y, and x : Xi — yi/{Az)i, then we initialize 
with : ff = Zi{A'^ x)i/ {A^V}i. The value of the regularization 
parameter r was tuned independently for each algorithm to yield the 
minimal root-mean-squared error RMS = ||/ — /||2/||/||2at the 
exhaustion of the computation budget (see Table[TJ. 



Reconstruction Method RMS 



SPIRAL (TI) 


0.1427 


SPIRAL (TV) 


0.1855 


EM-MPLE (TI) 


0.2485 


EM-MPLE (TV) 


0.2511 


SPIRAL (£i) 


0.2898 



Table 1. Reconstruction accuracy as measured by the root-mean- 
squared error, RMS = ||/ — /II2/II/II2. 

An examination of the results in Figs.[TJa-c) reveals that even 
though models within a partition (constant pieces) are less smooth 
than higher-order wavelets, this drawback is neutralized through a 
combination of cycle spinning (TI), the hereditary constraint, and a 
fast, non-iterative solver We compare our algorithm with an EM 
algorithm based upon the same maximum penalized likelihood esti- 
mation (MPLE) objective function proposed in |2](, which has been 
used previously in imaging contexts in which sparsity in a multiscale 
partition-based framework was exploited |[T3|. Although Fig. ^d) 
shows that the convergence behavior of the EM-MPLE approaches 
is more stable, their slow convergence ultimately hinders their per- 
formance as compared with the corresponding SPIRAL-based ap- 
proaches. In the case of SPIRAL-i*!, the estimate seems very os- 
cillatory; a smoother estimate could be achieved by increasing the 
regularization parameter r, but this leads to an "oversmoothed" so- 
lution and increases the RMS of the estimate. The large RMS val- 
ues early in the SPIRAL iterations are due to small values of ak 
when defining in jsj, which occur when the estimates are flat and 
the consecutive iterates only change slightly. The Barzilai-Borwein 
method |7 |, on which the choice of ctk is based, is not monotone and 
can exhibit spikes in the iterates' RMS values (the non-monotone 
SpaRSA algorithm is also not immune to this behavior, for exam- 
ple). Although it is difficult to characterize the convergence of this 
approach, it has been shown to be effective in solving minimiza- 
tion problems (see | 7 | for details). The SPIRAL partition-based es- 
timates appear to oscillate in steady state. This effect could be mit- 
igated by forcing the objective function to decrease monotonically 
with iteration, as described in |5| and as in EM algorithms. How- 
ever, empirically this appears to hinder performance, as convergence 
is significantly slowed. We are currently investigating methods that 
use a non-monotonic approach to obtain a good approximation to the 
minimizer, followed by a monotonic method to enforce convergence. 

5. CONCLUSION 

We have developed computational approaches for signal reconstruc- 
tion from photon-limited measurements — a situation prevalent in 
many practical settings. Our method optimizes a regularized Poisson 
likelihood under nonnegativity constraints. We have demonstrated 
that these methods prove effective in the compressed sensing context 
where an li penalty is used to encourage sparsity of the resulting so- 
lution. Our method improves upon current approaches in terms of 
reconstruction accuracy and computational efficiency. By employ- 
ing model-based estimates that utilize structure in the coefficients 
beyond that of a parsimonious representation, we are able to achieve 
greater accuracy with less computational burden. Future work in- 
cludes supplementing our algorithms with a debiasing stage, which 
maximizes the likelihood over the sparse support discovered in the 
main algorithm. We also will be exploring the efficacy of alternative 
optimization approaches such as sequential quadratic programming 
and interior-point methods. 
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Fig. 1. Reconstruction results using(a) the SPIRAL algorithm with translationally variant (TV) and translationally invariant (Tl) partitions, 
(b) the SPIRAL algorithm with an l\ penalty, and (c) the EM-MPLE algorithm with translationally variant (TV) and translationally invariant 
(Tl) partitions, (d) mean-squared error decay as a function of compute time. 
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